AITopics | logistic regression

logistic regression

Few-shot Cross-country Generalization of Tabular Machine Learning and Foundation Models for Childhood Anemia Prediction under Distribution Shift

Brima, Yusuf, Atemkeng, Marcellin, Kallon, Lansana Hassim, Niyukuri, David, Vacavant, Antoine, Saidu, Samuel, Chen, Ding-Geng

arXiv.org Machine Learning May-27-2026

Background: Childhood anemia affects an estimated 40% of children aged 6-59 months globally and arises from heterogeneous nutritional, infectious, and socioeconomic factors that vary substantially across settings. This variability challenges the generalizability of predictive machine learning models, which often degrade under cross-population or temporal shifts. We investigated the utility of a modern transformer-based tabular foundation model (TabPFN) as a complementary framework to supervised classical machine learning methods across diverse country contexts, with particular attention to data-scarce settings where surveillance capacity is most limited. Methods: We conducted a multi-country prediction study using Demographic and Health Surveys (DHS) children's recode data from 16 countries spanning Africa, Asia, Latin America, the Caucasus, and the Middle East. The harmonized analytic cohort comprised n = 68,856 children aged 6-59 months with valid hemoglobin measurements. Anemia was defined using WHO age- and altitude-adjusted thresholds and treated as a binary outcome. We trained Logistic Regression, XGBoost, and LightGBM models using standard supervised learning, and evaluated TabPFN v2.6 in an in-context learning setting. Performance was assessed using Area Under the Receiver Operating Characteristic Curve (AUC-ROC) and other standard classification metrics, with calibration evaluated via Brier score and expected calibration error (ECE). Uncertainty in performance estimates was quantified using bootstrap resampling to derive 95% confidence intervals. Robustness was assessed in a few-shot learning setting. Cross-population generalization was examined using leave-one-country-out (LOCO) validation and reverse-LOCO experiments to assess directional transferability. Subgroup analyses were conducted across five demographic strata: child age group, sex, maternal education, residence type, and household wealth quintile. Feature importance was assessed using standard linear and tree-based explainer SHAP values for the three supervised models and an adapted version of SHAP for TabPFN, aggregated across countries and examined at the country level. TabPFN also yielded the best probabilistic calibration across all 16 countries, achieving the lowest mean Brier score (0.203) and expected calibration error (ECE = 0.042) of all models evaluated; LightGBM and Logistic Regression exhibited the greatest miscalibration, particularly at higher predicted probabilities. Under full-data conditions, within-country discrimination was moderate across all models (AUC-ROC 0.59-0.76). Under LOCO validation, performance declined modestly (AUC-ROC 0.58-0.69). Reverse-LOCO analyses revealed asymmetric and directional transferability, with epidemiologically diverse populations serving as more informative training sources and certain target populations remaining persistently difficult to predict regardless of model or training data.
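
The abstract reports calibration through the Brier score and expected calibration error (ECE). As a rough illustration of what these two metrics measure, the sketch below computes both for a binary outcome in plain Python. It assumes the common equal-width binning variant of ECE with 10 bins; the abstract does not state which variant or bin count the authors used, and the function names and toy data here are hypothetical.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def expected_calibration_error(y_true, y_prob, n_bins=10):
    """Equal-width binned ECE (an assumed variant, not necessarily the paper's).

    Predictions are grouped into n_bins probability intervals. For each
    non-empty bin, the gap between the observed event rate and the mean
    predicted probability is weighted by the share of samples in that bin.
    """
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        # p == 1.0 would index past the last bin, so clamp it.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((y, p))

    n = len(y_true)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        observed_rate = sum(y for y, _ in members) / len(members)
        mean_confidence = sum(p for _, p in members) / len(members)
        ece += (len(members) / n) * abs(observed_rate - mean_confidence)
    return ece


# Toy data for illustration only (not from the study).
labels = [0, 0, 1, 1]
probs = [0.15, 0.45, 0.65, 0.95]
print(brier_score(labels, probs))
print(expected_calibration_error(labels, probs))
```

Lower values are better for both metrics. The Brier score mixes calibration and discrimination, whereas ECE isolates the gap between predicted and observed frequencies, which is why the two are often reported together.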

artificial intelligence, machine learning, predictor, (17 more...)

arXiv.org Machine Learning

2605.26589

Country:

Asia > Middle East (0.34)
North America > United States (0.28)
Europe > Middle East (0.24)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Hematology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

From Sequential Nodes to GPU Batches: Parallel Branch and Bound for Optimal $k$-Sparse GLMs

Liu, Jiachang, Lodi, Andrea

arXiv.org Machine Learning May-22-2026

GPUs have significantly accelerated first-order methods for large-scale optimization, especially in continuous optimization. However, this success has not transferred cleanly to problems with discrete variables, combinatorial structure, and nonlinear objectives, such as certifying optimal solutions for cardinality-constrained generalized linear models. Major challenges include the sequential processing of heterogeneous nodes in branch and bound (BnB) and frequent data movement between the CPU and GPU. We propose a simple, generic, and modular CPU-GPU framework that processes multiple BnB nodes in batches on GPUs. The framework is built around a small set of GPU-efficient routines and uses padding together with lightweight custom kernels to handle irregular node data structures. Experiments show one to two orders of magnitude speedups and zero optimality gap on challenging instances. The framework can also be extended to collect the entire Rashomon set, enabling downstream statistical analysis such as variable-importance analysis and model selection under secondary user-specific measures (e.g., AUC in classification).

artificial intelligence, machine learning, regression, (15 more...)

arXiv.org Machine Learning

2605.22188

Country: North America > United States (0.28)

Genre: Research Report (0.53)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Learning Interpretable Point-Based Clinical Risk Scores via Direct Optimization

Cui, Ying, Li, Albert M, Charu, Vivek, Hwang, Yeon-Mi, Hernandez-Boussard, Tina, Tian, Lu

arXiv.org Machine Learning May-20-2026

Many clinical risk scores are deployed as additive rules with nonnegative integer points assigned to relevant binary predictive features. These integer weights not only make the score easier to use in practice but also promote sparsity in the resulting prediction model. Such risk scores are often derived by first fitting a regression model and then rounding the estimated coefficients to the nearest integer after appropriate scaling. This approach is computationally fast but does not guarantee optimality of the resulting score. Alternatively, one may search over all possible integer weights to directly optimize a value function by posing the problem as an integer programming task. However, the associated computational burden can be substantial, especially when the value function is nonconcave or even discontinuous. In this paper, we develop new machine learning algorithms that employ a flexible greedy optimization strategy to learn such additive scoring directly under explicit and sensible optimality objectives. We apply the proposed method to a large electronic health record (EHR) cohort in Epic Cosmos to construct an integer-weighted comorbidity score for measuring the risk of post-discharge mortality. We also conduct a simulation study to examine the finite-sample operating characteristics.
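
The conventional baseline mentioned in the abstract (fit a regression, scale the coefficients, round to the nearest integer) can be sketched in a few lines. This is only an illustration of that baseline, not the greedy method the paper proposes. The scaling rule (smallest nonzero coefficient maps to one point), the clipping of negative values to zero, and all names and numbers below are assumptions made for the example.

```python
def round_to_points(coefs, scale=None):
    """Turn real-valued regression coefficients into nonnegative integer points.

    Assumed convention: if no scale is given, scale so that the smallest
    nonzero absolute coefficient corresponds to one point. Negative results
    are clipped to zero because the scores described are nonnegative.
    Note that Python's round() rounds exact halves to the nearest even integer.
    """
    nonzero = [abs(c) for c in coefs if c != 0]
    if scale is None:
        scale = 1.0 / min(nonzero) if nonzero else 1.0
    return [max(0, round(c * scale)) for c in coefs]


def risk_score(points, features):
    """Additive score: sum the points of the binary features that are present."""
    return sum(pt for pt, present in zip(points, features) if present)


# Hypothetical logistic-regression coefficients for four binary features.
coefficients = [0.4, 0.9, 1.3, 0.0]
points = round_to_points(coefficients)
print(points)                            # integer points per feature
print(risk_score(points, [1, 0, 1, 1]))  # score for one hypothetical patient
```

Because rounding happens after fitting, nothing ensures the rounded points are the best integer weights for the chosen objective, which is the gap the abstract says direct optimization is meant to close.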

artificial intelligence, machine learning, predictor, (16 more...)

arXiv.org Machine Learning

2605.19113

Country: North America (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Health Care Technology > Medical Record (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Increasing Missingness to Reduce Bias: Richardson-SGD with Missing Data

Genans, Ferdinand, Scornet, Erwan

arXiv.org Machine Learning May-20-2026

Stochastic gradient methods are central to modern large-scale learning, but their use with incomplete covariates remains delicate since imputation schemes generally introduce systematic gradient biases, as shown for linear models. In this work, we prove that all parametric models exhibit similar gradient bias for various imputation procedures and characterize exactly the dependence on the missingness ratio vector $p$, with $O(\|p\|)$ as the leading term. We exploit this analysis to propose a simple debiasing procedure for stochastic gradient descent (SGD) with missing values based on Richardson extrapolation, which leverages the exact expression of the gradient bias. The key idea is to deliberately add missingness: from an already incomplete observation, we generate a further-thinned version at a higher, controlled missingness level, and combine the two resulting stochastic gradients to cancel the leading bias term. We prove that one Richardson step reduces the gradient bias from $O(\|p\|)$ to $O(\|p\|^2)$ under several missingness scenarios. Our proposed method is computationally efficient, model-agnostic and applies to any parametric loss whose stochastic gradient can be computed after imputation. Furthermore, when missing indicators are independent, the population gradient bias is a multilinear polynomial in $p$ and depends only on population gradient errors induced by declaring a single coordinate missing. In this case, our method generalizes to a multi-step Richardson procedure which recursively cancels higher-order terms. Empirically, Richardson debiasing improves optimization and estimation across several generalized linear models and combines positively with widely used imputation procedures such as MICE. These results suggest that, somewhat counter-intuitively, adding controlled missingness on top of existing missing data can make stochastic learning from incomplete data more accurate.
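
The cancellation behind a single Richardson step can be seen in a scalar toy model. The sketch below is generic Richardson extrapolation, not the paper's estimator: it assumes a quantity whose bias is exactly $a p + b p^2$ in a scalar missingness level $p$, and that a second evaluation is available at the higher level $c p$. The function names, the constants, and the scalar setting are all hypothetical.

```python
def richardson_combine(value_at_p, value_at_cp, c):
    """Combine evaluations at levels p and c*p (c > 1) to cancel the O(p) term.

    If value(p) = g0 + a*p + b*p**2, then
    (c * value(p) - value(c*p)) / (c - 1) = g0 - b*c*p**2,
    so the first-order bias disappears and only a second-order term remains.
    """
    return (c * value_at_p - value_at_cp) / (c - 1.0)


def biased_value(p, g0=2.0, a=3.0, b=5.0):
    """Toy stand-in for a gradient whose bias grows with the missingness level p."""
    return g0 + a * p + b * p * p


p, c, target = 0.1, 2.0, 2.0
plain = biased_value(p)                          # evaluated on the observed data
thinned = biased_value(c * p)                    # evaluated after adding missingness
debiased = richardson_combine(plain, thinned, c)

print(abs(plain - target))     # first-order error, about 0.35 here
print(abs(debiased - target))  # second-order error, about 0.10 here
```

In this toy model the remaining error is $b c p^2$, so it shrinks quadratically as $p$ decreases, which mirrors the $O(\|p\|)$ to $O(\|p\|^2)$ reduction stated in the abstract.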

artificial intelligence, machine learning, regression, (17 more...)

arXiv.org Machine Learning

2605.19641

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.88)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Bayesian inference with sources of uncertainty: from confidence modelling to sparse estimation

Rosa, Rafael Mouallem, Arbel, Julyan, Nguyen, Hien Duy

arXiv.org Machine Learning May-6-2026

We introduce a general framework that extends Bayesian inference by allowing the researcher to explicitly encode confidence in each source of uncertainty within the model. This mechanism provides a new handle for model design and regularisation control. Building on this framework, we develop a general approach for inducing sparsity in statistical models and illustrate its use in linear and logistic regression, as well as in Bayesian neural networks.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

2605.03134

Country:

Europe (0.28)
Asia > Japan (0.28)

Genre: Research Report (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Multinomial Logistic Regression: Asymptotic Normality on Null Covariates in High-Dimensions

Neural Information Processing Systems May-1-2026, 04:56:55 GMT

This paper investigates the asymptotic distribution of the maximum-likelihood estimate (MLE) in multinomial logistic models in the high-dimensional regime where dimension and sample size are of the same order. While classical large-sample theory provides asymptotic normality of the MLE under certain conditions, such classical results are expected to fail in high dimensions, as documented for the binary logistic case in the seminal work of Sur and Candès [2019]. We address this issue in classification problems with 3 or more classes by developing asymptotic normality and asymptotic chi-square results for the multinomial logistic MLE (also known as the cross-entropy minimizer) on null covariates. Our theory leads to a new methodology to test the significance of a given feature. Extensive simulation studies on synthetic data corroborate these asymptotic results and confirm the validity of the proposed p-values for testing the significance of a given feature.

artificial intelligence, machine learning, theorem 2, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

New Bounds for Hyperparameter Tuning of Regression Problems Across Instances

Neural Information Processing Systems Apr-30-2026, 10:08:17 GMT

The task of tuning regularization coefficients in regularized regression models with provable guarantees across problem instances still poses a significant challenge in the literature. This paper investigates the sample complexity of tuning regularization parameters in linear and logistic regressions under ℓ1 and ℓ2-constraints in the data-driven setting. For the linear regression problem, by more carefully exploiting the structure of the dual function class, we provide a new upper bound for the pseudo-dimension of the validation loss function class, which significantly improves the best-known results on the problem. Remarkably, we also instantiate the first matching lower bound, proving our results are tight. For tuning the regularization parameters of logistic regression, we introduce a new approach to studying the learning guarantee via an approximation of the validation loss function class. We examine the pseudo-dimension of the approximation class and construct a uniform error bound between the validation loss function class and its approximation, which allows us to instantiate the first learning guarantee for the problem of tuning logistic regression regularization coefficients.

artificial intelligence, function class, machine learning, (16 more...)

Neural Information Processing Systems

Country: